Goto

Collaborating Authors

 Somerset County



Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness

Wei, Rongzhe, Niu, Peizhi, Hsu, Hans Hao-Hsun, Wu, Ruihan, Yin, Haoteng, Ghassemi, Mohsen, Li, Yifan, Potluru, Vamsi K., Chien, Eli, Chaudhuri, Kamalika, Milenkovic, Olgica, Li, Pan

arXiv.org Artificial Intelligence

Machine unlearning techniques aim to mitigate unintended memorization in large language models (LLMs). However, existing approaches predominantly focus on the explicit removal of isolated facts, often overlooking latent inferential dependencies and the non-deterministic nature of knowledge within LLMs. Consequently, facts presumed forgotten may persist implicitly through correlated information. To address these challenges, we propose a knowledge unlearning evaluation framework that more accurately captures the implicit structure of real-world knowledge by representing relevant factual contexts as knowledge graphs with associated confidence scores. We further develop an inference-based evaluation protocol leveraging powerful LLMs as judges; these judges reason over the extracted knowledge subgraph to determine unlearning success. Our LLM judges utilize carefully designed prompts and are calibrated against human evaluations to ensure their trustworthiness and stability. Extensive experiments on our newly constructed benchmark demonstrate that our framework provides a more realistic and rigorous assessment of unlearning performance. Moreover, our findings reveal that current evaluation strategies tend to overestimate unlearning effectiveness. Our code is publicly available at https://github.com/Graph-COM/Knowledge_Unlearning.git.



Pentagon yet to ask NORTHCOM to intervene against New Jersey drones

FOX News

Rep. Jeff Van Drew, R-N.J., addresses concerns over mystery drones flying over multiple New Jersey counties and details what sources have told him about their origins. The Pentagon has not yet asked U.S. Northern Command to intervene amid reports of mysterious drones witnessed flying over New Jersey, according to a military spokesperson. The large drone sightings have caused concern and confusion as dozens have been reported and officials are at a loss to explain where they come from. Northern Command confirmed some of the drones have been sighted near U.S. military installations. "We are aware and monitoring the reports of unauthorized drone flights in the vicinity of military installations in New Jersey to include Picatinny Arsenal and Naval Weapons Station Earle, and we refer you to those installations for information on any efforts they are may be conducting to ensure the safety and security of their personnel and operations," a U.S. Nothern Command spokesperson told Fox News Digital.


New Jersey leaders speak to DHS as unusual drone sightings now also reported over New York

FOX News

Officials are still investigating unusual drone activity that has been reported in recent weeks in New Jersey. The FAA set temporary restrictions above Trump National Golf Club in Bedminster in response. New Jersey Gov. Phil Murphy said he spoke with state and federal officials about unusual drone activity in parts of the region, including the vicinity of President-elect Trump's Bedminster golf club, but stressed there was no threat to public safety. In a Thursday post on X, Murphy said he convened a briefing with Homeland Security Secretary Alejandro Mayorkas, senior officials from DHS, the state police and state Homeland Security and Preparedness, as well as New Jersey's congressional delegation. "We are actively monitoring the situation and in close coordination with our federal and law enforcement partners on this matter," he wrote.


Optimal Projections for Classification with Naive Bayes

Hofmeyr, David P., Kamper, Francois, Melonas, Michail M.

arXiv.org Machine Learning

In the Naive Bayes classification model the class conditional densities are estimated as the products of their marginal densities along the cardinal basis directions. We study the problem of obtaining an alternative basis for this factorisation with the objective of enhancing the discriminatory power of the associated classification model. We formulate the problem as a projection pursuit to find the optimal linear projection on which to perform classification. Optimality is determined based on the multinomial likelihood within which probabilities are estimated using the Naive Bayes factorisation of the projected data. Projection pursuit offers the added benefits of dimension reduction and visualisation. We discuss an intuitive connection with class conditional independent components analysis, and show how this is realised visually in practical applications. The performance of the resulting classification models is investigated using a large collection of (162) publicly available benchmark data sets and in comparison with relevant alternatives. We find that the proposed approach substantially outperforms other popular probabilistic discriminant analysis models and is highly competitive with Support Vector Machines.


Data-Driven Estimation of the False Positive Rate of the Bayes Binary Classifier via Soft Labels

Jeong, Minoh, Cardone, Martina, Dytso, Alex

arXiv.org Artificial Intelligence

Classification is a fundamental task in many applications on which data-driven methods have shown outstanding performances. However, it is challenging to determine whether such methods have achieved the optimal performance. This is mainly because the best achievable performance is typically unknown and hence, effectively estimating it is of prime importance. In this paper, we consider binary classification problems and we propose an estimator for the false positive rate (FPR) of the Bayes classifier, that is, the optimal classifier with respect to accuracy, from a given dataset. Our method utilizes soft labels, or real-valued labels, which are gaining significant traction thanks to their properties. We thoroughly examine various theoretical properties of our estimator, including its consistency, unbiasedness, rate of convergence, and variance. To enhance the versatility of our estimator beyond soft labels, we also consider noisy labels, which encompass binary labels. For noisy labels, we develop effective FPR estimators by leveraging a denoising technique and the Nadaraya-Watson estimator. Due to the symmetry of the problem, our results can be readily applied to estimate the false negative rate of the Bayes classifier.


$L^1$ Estimation: On the Optimality of Linear Estimators

Barnes, Leighton P., Dytso, Alex, Liu, Jingbo, Poor, H. Vincent

arXiv.org Machine Learning

Consider the problem of estimating a random variable $X$ from noisy observations $Y = X+ Z$, where $Z$ is standard normal, under the $L^1$ fidelity criterion. It is well known that the optimal Bayesian estimator in this setting is the conditional median. This work shows that the only prior distribution on $X$ that induces linearity in the conditional median is Gaussian. Along the way, several other results are presented. In particular, it is demonstrated that if the conditional distribution $P_{X|Y=y}$ is symmetric for all $y$, then $X$ must follow a Gaussian distribution. Additionally, we consider other $L^p$ losses and observe the following phenomenon: for $p \in [1,2]$, Gaussian is the only prior distribution that induces a linear optimal Bayesian estimator, and for $p \in (2,\infty)$, infinitely many prior distributions on $X$ can induce linearity. Finally, extensions are provided to encompass noise models leading to conditional distributions from certain exponential families.


Counterfactual Collaborative Reasoning

Ji, Jianchao, Li, Zelong, Xu, Shuyuan, Xiong, Max, Tan, Juntao, Ge, Yingqiang, Wang, Hao, Zhang, Yongfeng

arXiv.org Artificial Intelligence

Causal reasoning and logical reasoning are two important types of reasoning abilities for human intelligence. However, their relationship has not been extensively explored under machine intelligence context. In this paper, we explore how the two reasoning abilities can be jointly modeled to enhance both accuracy and explainability of machine learning models. More specifically, by integrating two important types of reasoning ability -- counterfactual reasoning and (neural) logical reasoning -- we propose Counterfactual Collaborative Reasoning (CCR), which conducts counterfactual logic reasoning to improve the performance. In particular, we use recommender system as an example to show how CCR alleviate data scarcity, improve accuracy and enhance transparency. Technically, we leverage counterfactual reasoning to generate "difficult" counterfactual training examples for data augmentation, which -- together with the original training examples -- can enhance the model performance. Since the augmented data is model irrelevant, they can be used to enhance any model, enabling the wide applicability of the technique. Besides, most of the existing data augmentation methods focus on "implicit data augmentation" over users' implicit feedback, while our framework conducts "explicit data augmentation" over users explicit feedback based on counterfactual logic reasoning. Experiments on three real-world datasets show that CCR achieves better performance than non-augmented models and implicitly augmented models, and also improves model transparency by generating counterfactual explanations.


CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting

Graham, Simon, Vu, Quoc Dang, Jahanifar, Mostafa, Weigert, Martin, Schmidt, Uwe, Zhang, Wenhua, Zhang, Jun, Yang, Sen, Xiang, Jinxi, Wang, Xiyue, Rumberger, Josef Lorenz, Baumann, Elias, Hirsch, Peter, Liu, Lihao, Hong, Chenyang, Aviles-Rivero, Angelica I., Jain, Ayushi, Ahn, Heeyoung, Hong, Yiyu, Azzuni, Hussam, Xu, Min, Yaqub, Mohammad, Blache, Marie-Claire, Piégu, Benoît, Vernay, Bertrand, Scherr, Tim, Böhland, Moritz, Löffler, Katharina, Li, Jiachen, Ying, Weiqin, Wang, Chixin, Kainmueller, Dagmar, Schönlieb, Carola-Bibiane, Liu, Shuolin, Talsania, Dhairya, Meda, Yughender, Mishra, Prakash, Ridzuan, Muhammad, Neumann, Oliver, Schilling, Marcel P., Reischl, Markus, Mikut, Ralf, Huang, Banban, Chien, Hsiang-Chin, Wang, Ching-Ping, Lee, Chia-Yen, Lin, Hong-Kun, Liu, Zaiyi, Pan, Xipeng, Han, Chu, Cheng, Jijun, Dawood, Muhammad, Deshpande, Srijay, Bashir, Raja Muhammad Saad, Shephard, Adam, Costa, Pedro, Nunes, João D., Campilho, Aurélio, Cardoso, Jaime S., S, Hrishikesh P, Puthussery, Densen, G, Devika R, C, Jiji V, Zhang, Ye, Fang, Zijie, Lin, Zhifan, Zhang, Yongbing, Lin, Chunhui, Zhang, Liukun, Mao, Lijian, Wu, Min, Vo, Vi Thi-Tuong, Kim, Soo-Hyung, Lee, Taebum, Kondo, Satoshi, Kasai, Satoshi, Dumbhare, Pranay, Phuse, Vedant, Dubey, Yash, Jamthikar, Ankush, Vuong, Trinh Thi Le, Kwak, Jin Tae, Ziaei, Dorsa, Jung, Hyun, Miao, Tianyi, Snead, David, Raza, Shan E Ahmed, Minhas, Fayyaz, Rajpoot, Nasir M.

arXiv.org Artificial Intelligence

Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we setup a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of reproducible algorithms for cellular recognition with real-time result inspection on public leaderboards. We conducted an extensive post-challenge analysis based on the top-performing models using 1,658 whole-slide images of colon tissue. With around 700 million detected nuclei per model, associated features were used for dysplasia grading and survival analysis, where we demonstrated that the challenge's improvement over the previous state-of-the-art led to significant boosts in downstream performance. Our findings also suggest that eosinophils and neutrophils play an important role in the tumour microevironment. We release challenge models and WSI-level results to foster the development of further methods for biomarker discovery.